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SUMMARY 

The South Caucasus, situated between the Black 
and Caspian Seas, geographically links Europe with 
the Near East and has served as a crossroad for hu- 
man migrations for many millennia [1-7]. Despite a 
vast archaeological record showing distinct cultural 
turnovers, the demographic events that shaped the 
human populations of this region is not known 
[8, ]. To shed light on the maternal genetic history 
of the region, we analyzed the complete mitochon- 
drial genomes of 52 ancient skeletons from pre- 
sent-day Armenia and Artsakh spanning 7,800 years 
and combined this dataset with 206 mitochondrial 
genomes of modern Armenians. We also included 
previously published data of seven neighboring pop- 
ulations (n = 482). Coalescence-based analyses sug- 
gest that the population size in this region rapidly 
increased after the Last Glacial Maximum ca. 
18 kya. We find that the lowest genetic distance in 
this dataset is between modern Armenians and the 
ancient individuals, as also reflected in both network 
analyses and discriminant analysis of principal com- 
ponents. We used approximate Bayesian computa- 
tion to test five different demographic scenarios 
explaining the formation of the modern Armenian 
gene pool. Despite well documented cultural shifts 
in the South Caucasus across this time period, our 
results strongly favor a genetic continuity model in 
the maternal gene pool. This has implications for in- 
terpreting prehistoric migration dynamics and cul- 
tural shifts in this part of the world. 


RESULTS AND DISCUSSION 

In this study, we present 206 mitochondrial genome sequences 
from three sub-populations of modern Armenians and 44 (plus 
eight previously published) mtDNA genomes from ancient indi- 
viduals excavated in Armenia and Artsakh (Figure 1; Table SI). 
The calibrated radiocarbon dates of the ancient samples ranged 
between 300 and 7,81 1 years BP, with the majority being Bronze 
Age individuals, 3,000 to 4,000 years old (Table SI). 

Shotgun sequencing data from all 44 ancient DNA extracts 
showed increased deamination damage rates at both 5' and 3' 
ends of sequencing reads compared to the revised Cambridge 
reference sequence (rCRS) reference mitochondrial sequence. 
The C^T transition rates at the first position of sequenced 
DNA fragments were between 8.9%-43.7%, indicating that the 
profiled DNA molecules were of ancient origin (Table SI). The 
estimated levels of DNA contamination were <8%, with an 
average of 1.3% across the entire ancient dataset (Table SI). 
Three pairs among the 44 ancient individuals had pairwise iden- 
tical mitochondrial genome sequences (Table S2). Combined 
with archaeological data suggesting a close relationship (the 
same site and grave locations), these identical mtDNA se- 
quences indicated a maternal relationship, and we therefore 
excluded data from one individual of each pair in most down- 
stream analyses. Summary statistics and genetic diversity 
values for all groups are shown in Table S3. Negative Tajima’s 
D values, observed for all four groups, could suggest a recent in- 
crease in population size [10]. 

The major mtDNA haplogroup frequencies in the four groups 
(three modern and one ancient) are presented in Figure 2, and 
qualitatively it is clear that the modern Armenian groups and 
the ancient group display obvious similarities. The three Neolithic 
samples (arm7, arm9, and arm39; ca. 7,800 years BP) in our 
dataset have mitochondrial haplogroups H and I, which have 
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Figure 1. Study Area 

Maps of the Near East (insert) and Armenia with 
sampling and origin areas of ancient and modern 
individuals, respectively. The legend shows the 
number of samples and the approximate time 
period covered by each cultural group. EBA, early 
Bronze Age; MBA, middle Bronze Age; LBA, late 
Bronze Age; EIA, early Iron Age; LIA, late Iron Age. 
See also Table SI. 
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previously been associated with the Neolithic expansion of 
farming cultures from the Near East [11]. Interestingly, hap- 
logroup I, which first seems to appear in Europe during the 
Late Neolithic (ca. 4,000 years BP) [11], is observed in a Neolithic 
individual (arm39) from the South Caucasus, dated to 7,800 
years BP. This early presence could reflect the geographic prox- 
imity of the South Caucasus to the putative place of haplogroup I 
origin in Southwest Asia [12, 13]. 

A correspondence analysis based on extended haplogroup 
frequencies of the ancient group, modern Armenians, and 
comparative populations (Africa, Europe, Caucasus, Near East, 
Central Asia, and East Asia) (Table S4) is presented in Figure SI . 
The plot clearly shows the clustering of the ancient group 
together with the modern European, Armenian, and Caucasian 
populations. We observe none of the typical East Eurasian 
mtDNA lineages (A, C, D, F, G, and M) among the ancient individ- 
uals, and only one individual with haplogroup D is present in the 
modern Armenian maternal gene pool (Artsakh). As such, the 
archaeologically and historically attested migrations of Central 
Asian groups (e.g., Turks and Mongols) into the South Caucasus 
[14, 15] do not seem to have had a major contribution in the 
maternal gene pool of Armenians. Both geographic (moun- 
tainous area) and cultural (Indo-European-speaking Christians 
and Turkic-speaking Muslims) factors could have served as 
barriers for genetic contacts between Armenians and Muslim 
invaders in the 1 1 th — 1 4 th centuries CE. The same pattern was 
observed using Y chromosome markers in geographically 
diverse Armenian groups [16, 17]. An absence of East Eurasian 


mtDNA lineages in Armenia and Georgia 
was previously shown by Schonberg 
et al. [18], whereas in neighboring 
Turkic-speaking groups (Azeri and Turks), 
haplogroups A, C, D, F, G, and M7 are 
indeed present [18, 19], perhaps brought 
in by the Oghuz and Mongol migrations 
ca. 1,000 years ago. 

We constructed a multidimensional 
scaling (MDS) plot based on the F sr ge- 
netic distance matrix to visualize the ge- 
netic differentiation between our sample 
groups and seven other populations 
from the South Caucasus and the Near 
East for which complete mtDNA genome 
sequences data were available (Figure 3). 
The F S t values show that the ancient indi- 
viduals are genetically closest to the mod- 
ern Armenian group from Erzrum and to 
modern Georgians (Table S5), and on 
the MDS plot they also cluster together with the three modern 
Armenian groups from Erzrum, Artsakh, and Ararat. The genetic 
distances between the ancient group and most of the included 
modern populations (F sv - values ranging from 0 to 0.0145) were 
not significantly different, indicating possible close genetic ties 
between ancient and most modern populations of the South 
Caucasus and northern parts of the Near East. We note that 
the sample sizes of several of the previously published compar- 
ative datasets were relatively small (n < 30), which could 
potentially cause a slight misrepresentation of the true genetic 
diversity of the source population. This is evident from the previ- 
ously published smaller dataset of the Armenians [18], which 
does not capture the same diversity of mtDNA lineages (e.g., 
haplogroups R and I) as we document here in our larger dataset. 

The genetic similarity between the ancient group and modern 
Armenians is also reflected in a TCS network of mtDNA haplo- 
types, a discriminant analysis of principal components, and the 
maximum-parsimony phylogenetic tree presented in Figure S2 
and Data SI . 

A Bayesian skyline plot (BSP) based on all modern and ancient 
mitochondrial genomes analyzed together revealed four putative 
demographic events (geometric mean of 4.4 with 95% highest 
posterior density intervals between 4 and 6 as obtained from 
the results of extended skyline plot analysis). The plot indicates 
a small but noticeable decrease in the effective female popula- 
tion size (N e ) around 25 kya during the Last Glacial Maximum 
(LGM), which is followed by a rapid (roughly 10-fold) population 
increase until around 10 kya (Figure 4A). The same result was 
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Figure 2. mtDNA Haplogroups 

The figure shows the modern groups including three Armenian (Ararat, Artsakh, 
and Erzrum), Central Asian (Tajikistan, Kyrgyzstan, and Uzbekistan combined), 
and European (Italy-Tuscany, Poland, and Bulgaria combined) populations. 
“Other” denotes the combined frequencies of mtDNA haplogroups with less 
than 1 % in Europe and Central Asia. Extended haplogroup frequencies and 
references for the European and Central Asian populations are presented in the 
Table S4. See also Data SI , Figure SI as well as Tables S2 and S3. 
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Figure 3. MDS Plot 

The MDS analysis was based on F sr genetic distances using whole mtDNA 
sequences of the ancient group (blue), modern Armenians (blue), and neigh- 
boring populations from the Near East and the South Caucasus (green). See 
also Table S5 and Figure S2. 


observed when analyzing the data without partitioning the 
mtDNA sequences into mutation-dependent segments. This de- 
mographic trajectory is in accordance with previously published 
results based on data from European Mesolithic and Paleolithic 
individuals [20]. Interestingly, N e appears to be declining around 
5 kya, although the large confidence interval makes this conclu- 
sion tentative. This result was not observed in previous studies 
based on smaller samples size and modern data alone [18]. 
Interestingly, the timing of this putative decline coincides with 
the formation of complex societies during the Bronze Age in 
the region [21]. This could have increased susceptibility to dis- 
eases such as plague, which we know was present in both Cen- 
tral Asia and Europe during the early Bronze Age [22]. Another 
possibility is that the society formation of Bronze Age popula- 
tions could have reduced the effective female population size 
without affecting the census population sizes. Factors like pop- 
ulations size fluctuations, increased selection, variation in family 
size, and changing population sub-structuring can all affect the 
estimates of effective population size [23]. However, it has previ- 
ously been noted that recent population declines on BSP plots 
should be interpreted with caution as it may be an artifact of pop- 
ulation structure [24]. 

Furthermore, we used approximate Bayesian computation 
(ABC) analyses to test five possible demographic model sce- 
narios (Figure 4B), simulating 1,000,000 datasets from each 
model. For the modern group, we used the combined (n = 206) 
modern Armenian population (see the STAR Methods for the de- 
tails of ABC analysis and rationale behind the model choices). 
The cross-validation of the ABC model selection is summarized 
in Table S3, showing that we can easily distinguish between the 
genetic continuity scenario (model 1) and the rest. We used two 
statistical tests, marginal density p value and Tukey depth 
p value, to assess the fit of our five models to the observed 
data. All models show high values for both statistics, indicating 
a good fit for all of them to the observed data (Table S3). 

Based on comparison of the marginal densities, the analysis 
favor model 1 (posterior probability of 89% and Bayes factor of 
8.1), which assumes genetic continuity between the ancient 


group and the modern Armenians (Table S3). This result sug- 
gests that there were no major genetic shifts in the mtDNA 
gene pool in South Caucasus across the last 7,800 years. 

Using genetic data of modern Armenians, Haber et al. sug- 
gested that the Armenian gene pool was formed as a result of 
admixture events happening ca. 4,500 years BP [25]. Our ancient 
DNA (aDNA) data suggest that at least the maternal gene pool in 
the South Caucasus has been very stable and was largely 
formed before these events. A scenario of genetic continuity is 
supported by two previous studies that included low-coverage 
genomic data from a few ancient individuals from the South Cau- 
casus: Allentoft et al. observed genetic similarities between 
Bronze Age individuals (ca. 3,500 years BP) and modern 
Armenians [26], and Lazaridis et al. showed similarity between 
Chalcolithic (ca. 6,000 years BP) and Bronze Age (ca. 3,500 
years BP) individuals excavated in Armenia [7]. Moreover, Jones 
et al. presented results implying that such continuity might 
extend even further back in time: it appears that Upper Paleo- 
lithic Caucasus hunter-gatherers and Mesolithic individuals 
from the South Caucasus (Georgia) are genetically close to mod- 
ern Caucasian groups, albeit also displaying their own genetic 
component [3]. The two hunter-gatherer individuals from this 
study had variants of mtDNA haplogroups H and K, which 
have typically been associated with later Neolithic times. 

Our results have implications for how the known cultural shifts 
in the South Caucasus are interpreted. It appears that during the 
last eight millennia, there were no major genetic turnovers in the 
female gene pool in the South Caucasus, despite multiple well- 
documented cultural changes in the region [27, 28], This is in 
contrast to the dramatic shifts of mtDNA lineages occurring in 
Central Europe during the same time period, which suggests 
either a different mode of cultural change in the two regions or 
that the genetic turnovers simply occurred later in Europe 
compared to the South Caucasus. More data from earlier Meso- 
lithic cultures in the South Caucasus are needed to clarify this. 
During the highly dynamic Bronze Age and Iron Age periods, 
with the formation of complex societies and the emergence of 
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Figure 4. Population Demographics 

(A) Bayesian skyline plot. Values on the y axis represent the effective female 
population size (N e ) x generation time (g). 

(B) Schematic representation of the five demographic models used for coa- 
lesced ABC simulations. Population sizes of modern Armenians (N M ), ancient 
group (Na), “ghost" Asian population (N G ), first bottleneck (20 kya; N B ), second 
bottleneck (40 kya; N b ) are shown. 

For the details of ABC analysis and rationale behind the model choices, see 

STAR Methods. See also Table S3. 


Due to the lack of available ancient and modern mtDNA ge- 
nomes from other regions of the South Caucasus, we have 
here used Armenians as a representative group of the region. 
Considering the low and in many cases non-significant genetic 
differences that we observe between populations of the South 
Caucasus, one would expect to observe a somewhat similar 
pattern of matrilineal genetic continuity in other parts of this re- 
gion, i.e., Georgia, Azerbaijan, and Armenian Highland (partially 
modern day Eastern Turkey and North-West Iran). Future studies 
should, however, prioritize to expand the sampling of both mod- 
em and ancient populations in the whole region to uncover the 
geographic and temporal extent of this genetic continuity signal. 

STAR*METHODS 

Detailed methods are provided in the online version of this paper 
and include the following: 

• KEY RESOURCES TABLE 

• CONTACT FOR REAGENT AND RESOURCE SHARING 

• EXPERIMENTAL MODEL AND SUBJECT DETAILS 
O Modem samples 

O Ancient samples 
O Archaeological Sites 

• METHOD DETAILS 
O Modem samples 
O aDNA extraction 

O NGS library preparation and sequencing of ancient 
samples 

O Sequencing of aDNA 

• QUANTIFICATION AND STATISTICAL ANALYSIS 

O Carbon dating 
O Basic bioinformatics 
O Authentication of aDNA 
O Haplogroups 

O Phylogenetic tree construction 
O Effective population size 
O Demographic model testing 

• DATA AND SOFTWARE AVAILABILITY 

SUPPLEMENTAL INFORMATION 

Supplemental Information includes two figures, five tables, and one data file 
and can be found with this article online at http://dx.doi.Org/10.1016/j.cub. 
2017.05.087. 


distinctive cultures such as Kura-Araxes, Trialeti-Vanadsor, 
Sevan-Artsakh, Karmir-Berd, Karmir-Vank, Lchashen-Metsa- 
mor, and Urartian, we cannot document any changes in the fe- 
male gene pool. This supports a cultural diffusion model in the 
South Caucasus, unless the demographic changes were heavily 
male biased, as was most likely the case in Europe during the 
Bronze Age migrations [29, 30]. However, genome-wide data 
from the few Bronze Age individuals published so far from the 
South Caucasus also support a continuity scenario [26], Another 
possibility is that any gene flow into the South Caucasus 
occurred from groups with a very similar genetic composition, 
facilitating only subtle genetic changes that are not detectable 
with the current datasets. 
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Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Morten E. 
Allentoft (meallentoft@snm.ku.dk). 

EXPERIMENTAL MODEL AND SUBJECT DETAILS 

Modern samples 

Saliva samples were collected from a total of 206 unrelated self-identified ethnic Armenians, whose ancestors (at grandparental level) 
originated from Artsakh (n = 74, individuals sampled in Stepanakert), Ararat (n = 61 , individuals sampled in villages of the Ararat Val- 
ley), and Erzrum - modern Turkey (n = 71 , individuals sampled in Yerevan). All subjects provided written informed consent for the 
collection of samples and subsequent analysis. Sampling was conducted by the scientific staff following an approval from the ethical 
board (IRB-00004079) at the Institute of Molecular Biology, NAS, Armenia. 

Ancient samples 

We sampled 44 ancient human skeletons from Armenia and Artsakh according to established aDNA guidelines [54] and sampling 
permissions were obtained from respective state institutions. A total of 19 archaeological sites are represented, covering large parts 
of Armenia as well as Artsakh (Figure 1), and estimated to be between 300-7800 years old based on contextual dating of artifacts. 
This time span is accompanied by at least seven well-defined cultural transitions: Neolithic, Chalcolithic, Kura-Araxes, Trialeti- 
Vanadzor 2, Lchashen-Metsamor, Urartian and Armenian Classical/Medieval (Figure 1). 

Archaeological Sites 
Aknashen 

The site of Aknashen-Khatunarkh is located in the Ararat Plain, in the basin of the Sev dzhur(Metsamor), at an altitude of 838 m, in the 
province (marz) of Armavir (6 km south of Echmiadzine, on the northeast periphery of the village of Aknashen). The site is an artificial 
hill (blur), circular in plan, 1 00 m in diameter (a surface area of about 0.8 hectares), with a flat top rising about 3.5 m above the plain. 
The excavations of the site have been conducted since 2004 by the Armenian-French joint expedition. 

The site belongs to the Late Neolithic “Aratashen-Shulaveri-Shomutepe” culture and dates back to the first half of the VI millennium 
BC. Fluman remains occur both from a cultural layer of the synchronous deliberately committed burial (Tr.6, UF 11, F.15) and of 
household waste (unburied remains of a newborn, Tr.4, UF 7a), and intrusive burial of the end of Early Bronze - the beginning of 
the Middle Bronze Age (second half of the III millennium BC; Tr.7, UF 5, F2). 

Akunk 

The cemetery is located in the area neighboring the eastern part of the village Akunq, Gegharquniq region. E. Khanzadyan carried out 
excavations here in 1970. Barrows were excavated, which contained one and more tombs with stone-cist and pit constructions. 
Tombs are related to the 9-8 th centuries BC. 

Artsvakar 

The necropolis is located at the northern edge of the Artsvaqar region of the city Gavar. The Iron Age (11-9 centuries BCE) remains 
are found in the necropolis occupying about 0.1 squared kilometers. The tombs are earth and stone tumuli, 0.3-0. 5 m high and 
having been built at ground level. Systematic archaeological excavations at the site have been conducted since 1992, and over 
17 tombs dating from the Late Bronze Age (18 - 16 centuries BC) to Iron Age (12/1 1 -9/8 centuries BC) period have been excavated 
there. 
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Azatan 

The site is situated about 2 km south-west from the village Azatan (Shirak region) and takes an area of 1 ,5 km 2 . It consists of a fortress, 
a settlement and 3 cemeteries, spread around them. Since 201 1 till today it is studied by the Local-Lore Museum of Shirak and the 
archaeological expedition of the university Halle. The cemetery is dated to the Early Iron Age-Hellenistic periods. Pit, cist, stone-cist, 
catacomb, rock-cut tombs have been excavated. Pit burials are individual burials: The deceased are on left and right sides with the 
face to the east. Catacomb and rock-cut ones are sequential burials of some persons with traces of decarnation and human sacrifice. 
Dari Glukh 

The necropolis is located close to the city Gavarand is distributed on northern side of the Gavar- Noratus highway. The Late Bronze/ 
Early Iron Age remains are found in the necropolis occupying about 0.2 km squared. Systematic archaeological excavations at the 
site have been conducted since 2003, and over 12 tombs dating to the Late Bronze/Early Iron Age (15-11 centuries BC) period have 
been excavated. 

Godedzor 

The settlement Nerkin Godedzor is situated 1 .5 km west of the village Angeghakot, within the Vorotan river gorge, on the left bank of 
the river. Systematic excavations of the site have been realized during 2004-201 4 by an Armenian-French team. The lower level of the 
site deals with the final stage of the Chalcolithic - transition to the Early Bronze Age (Kura-Araxes culture) and is dated to ca. 3600- 
3400 BC. The results of excavations demonstrate that it was a provisional settlement to be visited during summer season: in winter 
the population moved back to the lowlands, very probably situated on the southern shores of the Lake Urmia (Iran). The burials are 
within the settlement, in the pits made in clay-plastered floors of wooden-roof yards. 

Kanagegh 

The necropolis is located not far from Yeranos village, on southern shore of Lake Sevan. Distributed on both sides of the Sevan - 
Vardenis Highway. The Late Bronze Age - Iron Age remains are found in the south-western part of the necropolis, in a cemetery 
occupying about 0.3 km squared. Systematic archaeological excavations at the site have been conducted since 1982, and over 
1 5 tombs dating from the Late Bronze Age (18-16 centuries BC) to Iron Age (1 2/1 1 - 9/8 centuries BC) period have been excavated 
in 1999-2002. 

Kaps 

The Early Bronze Age cemetery is situated in the northern part of the village Kaps (Shirak region), on the south-eastern slope of the 
eastern height of the homonymous reservoir. During the constructive works of the reservoir, the slope of ca. 60 m width was cut in 
order to open a new road and both 2 sides of the cutting destroyed stone-cist tombs became visible. The excavated tomb is 2,1 x 
2,1 x 1 ,00 m in size. In the eastern wall of the tomb there was an entry. Three burials have been made: skeletons of two of them were 
collected in the north-eastern corner; the third one was on the left-side, along the southern wall. A finger-ring of bronze, as well as two 
black-burnished vessels were found in the tomb. 

Karkar 

The fortress-city Karkar is situated between the cities Stepanakert and Shushi, on the cape of the right bank of the river Karkar. The 
area of the site is 40-45 ha. The settlement was protected by a cyclopean masonry from the south, and by a three-stepped ground 
barrier and water-ditch from the north. The excavations attested that the city was settled since the 8-7 th centuries BC till the 14 th 
century AD. At the beginning of the 1 8 th century a small fort-post was founded on the hill of the citadel, in the ruins of the round tower 
of which a secondary burial was found, carried out after the destroying of the tower, probably in the 1 8 th century AD. A bone from this 
burial was given for expertise. 

Karashamb 

The Karashamb necropolis is situated 1 .6 km south of village Karashamb, in Kotayk Province, on one of the terraces of the right bank 
of Hrazdan Gorge (lat. -40 23 49.3 N, long. -44 35 28.8 E). Today the preserved segment of the necropolis occupied an area of about 
3.5 hectares and is only a fraction of the previous larger cemetery. Systematic archaeological excavations at the site have been con- 
ducted since 1981 - 1984, 2009 - 2015, and over 1500 tombs dating from the Middle Bronze through Iron II Ages period (the last 
quarter of the III millennium BCE to the second quarter of the 1st millennium BC). 

Burials NN 50, 462 and 750 were excavated during archaeological field works in 2010 and 2013. These are attributed to the middle 
(second) phase (20 - 18 centuries BC) of theTrialeti-Vanadzor culture, which is dated from the last quarter of the III millennium BC to 
the early part of the II millennium BC (23-16 centuries BC). 

Karintak 

This cave is located in the Shushi Region (Artsakh), in the midst of the lush Karintak Forest (which derives its name from a nearby 
village). The name ‘Karintak’ is apt as it means ‘beneath the rock’. The Grid Reference for the location is N 39.74320° E 46.76611° 
and the elevation is approximately 1 ,400 m. 

Karintak cave contains two separate cave passages (termed Cave 1 and Cave 2). Cave 1 is considerably longer, extending for 
approximately 42 m in north-east direction (inward from the entrance). The passage continues deeper, beyond this point, but is 
too narrow to pass through. Cave 1 actually narrows and widens (constriction and necking) several times along its length, producing 
a set of small ‘sub-chambers’. Cave 2 extends in from the entrance in a westerly direction for only c. 10 m before turning 90° in a 
southerly direction for another c. 5 m. 

Karintak cave is still fairly wet today. Stalactites, stalagmites and flowstone were observed in the rear of Cave 1 . 

Archaeological survey of the site has been conducted since 201 4, a 2 m long x 1 m wide x 1 .8 m deep test pit approximately 30 m 
from the cave mouth was excavated. The human tooth used for this study was excavated in 2015 in the Cave 1. 
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Metsamor 

The cemetery of Metsamor is situated in the south-eastern part of the city Metsamor. In 201 1 one destroyed tomb was excavated 
here. It is located not far from the highway Yerevan-Armavir, on a natural hill. Excavations demonstrate that an Early Iron Age tomb 
was on the natural hill, in which later - a new burial was done to be destroyed during the construction of the city. 

Nerkin Getashen 

The necropolis is located at Nerqin Getasehen village, and is distributed on both sides of the Yerevan -Sevan-Vardenis highway. The 
Middle Bronze Age to Iron Age remains are found in the southern part of the necropolis, in a cemetery occupying about 0.7 squared 
kilometers. Systematic archaeological excavations at the site have been conducted since 1905, and over 67 tombs dating from the 
Middle Bronze Age (22/21 - 17/16 centuries BC) through Iron Age (13/12 - 9/8 centuries BC) period have been excavated. 

The Middle Bronze Age is represented by 7 tombs dated to the 19-22 centuries BC and belong to the Trialeti-Vanadzor and 
Sevan-Artsakh cultures (ll-lll phases of the Middle Bronze Age). The tombs are earth and stone tumuli, 0.5-1 .5 m high and had 
been built at ground level. 

The Late Bronze Age -Iron Age is represented by 58 tombs. These are dated to the 15/14 - 9/8 cc. BCE and belong to the 
Lchashen-Metsamor culture. The tombs are earth and stone tumuli, 0.3-1 .5 m high, having been built at ground level and excavated 
in 1989-1991. 

Nerqin Naver 

Nerqin Naveri necropolis is situated near Parpi, Oshakan, Voskevaz villages in Aragatsotn Province around 1100 m above sea level 
and was discovered in 1978. On the east side of burial field linearly located 5 tumuli. Tumulus N 1 was excavated during archaeo- 
logical field works between 2002-2003 and had multiple layers. In the top layer (80 cm) pieces of pottery and a horse iron bridle 
was found while in the second layer (80-120 cm) human hand and foot phalanges were discovered. In the third layer (120- 
200 cm) in the north-eastern part of the burial, human long bones, phalanges and teeth were excavated which didn’t have anatomical 
positioning. According to the anthropologists, these bones belonged to two individuals. Pieces of jewelry, silverware and sacrificed 
animal bones were also discovered in this layer. In deeper layers (200-250 cm) scattered animal bones of many species, jewelry and 
pottery was present. At the bottom of this layer human teeth from both children and adults were found. This burial is attributed to the 
Iron Age. 

Sevan 

The excavation was conducted in 1956 by an expedition of the Historical Museum of the Academy of Sciences headed by H. Mnat- 
sakanyan. On the dried territories of the lake Sevan (Tsamakaberd) more than 20 burials had been excavated. Tomb 19 is oriented to 
the north-east to south-west; the chamber is a stone cist with longer sides (two slabs), and shorter (one stone slab) ones from north- 
ern and southern parts. The tomb was covered by two stone slabs, one of which was demolished with waters of the lake. Inner part of 
the chamber was covered with limestone sediment solidified thickness of about 30 cm. Bones here were well preserved more than in 
the other graves. The skeleton was lying on the left side with bent arms and legs. The excavated material is dated to 12th cent BC. 
Sotk 

There are two main archaeological site in Sotk: Sotk 1 and Sotk 2, excavated durint 2011-2015 by an Armenian-German team. The 
site Sotk 1 is located in the southern part of the village Sotk, Gegharkunik Region, RA, and is a fortress-settlement with intrusive 
tombs within the settlement. The site comprises on the whole an area of ca. 1 .5 ha. The fortress-settlement dates from the Achae- 
menide and Artasheside periods (ca. 500-1 BC), the intrusive tombs without inventory are surely Medieval (perhaps ca. 500-1500 
BC). Systematic archaeological excavations at the site have been conducted during 201 4 in various parts of the fortress-settlement. 
We deal with a road-guiding fortress in which burials have been made during Medieval period, when the fortress was abandoned. 

The site Sotk 2 is located in northern part of the village Sotk, Gegharkunik Region, RA, and is a fortress-settlement with tombs 
within the settlement. The site comprises on the whole an area of ca. 2 ha. The settlement dates from the Early to Late Bronze 
Ages (ca. 3000-1200 BC), the fortress from the Middle and Late Bronze Ages (ca. 1600-1200 BC). The intramural pit-graves are 
contemporary with the settlement. Systematic archaeological excavations at the site have been conducted during 2011-2015 in 
various parts of the fortress-settlement guiding the road to the gold mines of Sotk and the road to Karvachar. The sample derives 
from the intramural child burial belonging to the Early Bronze Age. 

METHOD DETAILS 

Modern samples 

Samples from the modern Armenian individuals (n = 206) were sequenced at the Genetics Laboratory of Institute of Biological Prob- 
lems of the North, Russian Academy of Sciences, Magadan, Russia. 

Genomic DNA was extracted using a standard phenol/chloroform procedure. The entire mtDNA was amplified in 1 1 overlapping 
PCR fragments, using a set of primers with matching annealing temperatures as described in detail by Torroni et al., 2001 [55]. After 
PCR, the fragments were purified using the DiatomTM DNA Clean-Up kit (Isogene Laboratory), and Cycle Sequencing was performed 
by application of BigDye Terminator v3.1 chemistry (Applied Biosystems), using set of 32 nested primers specifically designed for this 
protocol [55]. An ABI 3500xL sequencer was used for separation of the sequencing ladders. Complete mtDNA sequences were 
aligned and assembled using the software SeqScape 2.7 (Applied Biosystems). The obtained complete mtDNA sequences were 
compared to the revised Cambridge reference sequence (rCRS) [56] and assigned to haplogroups based on human mtDNA phylog- 
eny presented in mtDNA tree Build 17 (18 February 2016) (http://www.phylotree.org). 
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A total of 482 whole mtDNA sequences from seven populations from the South Caucasus and the Near East were also used as a 
comparative dataset [18, 19, 31], 

aDNA extraction 

The ancient DNA (aDNA) work was conducted in dedicated aDNA clean-room facilities at Centre for GeoGenetics, Natural History 
Museum, University of Copenhagen according to strict aDNA standards [54, 57]. The vast majority of ancient samples (41 out of 
44) were teeth, but one petrous bone and two postcranial bones were also used (Table SI). We targeted the endogenous DNA 
rich outer layer of the tooth roots for ancient DNA extraction [58]. 

All the samples were mechanically cleaned to minimize contamination from modern DNA by removing the surface area with 
diamond-dust-coated disk using a drill. For the teeth samples, only the outer cementum-rich part of the roots was used for DNA 
extraction by removing the crown and root dentine with a cutting disk and pointy drili-bit respectively [58]. For the few petrous 
bone samples, we targeted the otic capsule region which was previously shown to have the highest levels of endogenous DNA 

[59] , The drilled bone material (ranging from 200 to 400 mg) was briefly digested (predigestion step) by incubating in digestion buffer 
(4.65 mL 0.5 M EDTA, 50 pL recombinant Proteinase K, 50 pL 1 0Ox TE and 250 pL 1 0% N-Laurylsarcosyl) for 45 min at 40°C. After this 
predigestion step [58], the samples were centrifuged, and the supernatant was removed after which an identical digestion buffer was 
applied for a full 24 hr digestion at 40°C. After this step, the samples were centrifuged and the remaining pellets were stored for later 
re-extraction. The DNA was isolated from 2 mL digested solution using a silica-powder-based extraction method. The silica suspen- 
sion was prepared by mixing 6g of Si02 with 50 mL H20. After 1 hr of sedimentation, 48 mL supernatant was transferred to a new 
50 mLtube followed by another 5 hr sedimentation. Then the top 43 mL was removed and the silica was re-suspended and activated 
with 60 pL37% HCL. To each of 2 mL digested sample, 20 mLof the binding buffer (19.54 mL QIAGEN buffer PB, 360 pL5M sodium 
acetate, 1 00 pL 5M sodium chloride) and 1 00 pL silica suspension was added and adjusted to pH 4-5 with 37% HCI [26]. After a 1 hr 
incubation at room temperature the supernatant was removed after a brief centrifugation step at 2000 g for 2 min and the pelleted 
silica was re-suspended in 1 mL binding buffer and washed twice with 80% cold ethanol. The DNA was eluted from silica particles in 
60 pL QIAGEN EB buffer. Extraction blanks were included with each round of extractions. 

NGS library preparation and sequencing of ancient samples 

DNA extracts (20 pi) were build into blunt-end libraries using lllumina-specific adapters and NEBNext DNA Sample Pre Master Mix 
Set 2 (E6070) kit according to manufacturer’s instructions with some modification mentioned below: 

Since ancient DNA molecules are already naturally fragmented, the nebulization step was skipped. The end-repair step was carried 
out in 25 pL reactions using 20 pL of DNA extract. This was incubated for 20 min at 12°C and 15 min at 37°C, and purified using PB 
buffer with QIAGEN MinElute spin columns, and eluted in 17 pi. Next, lllumina-specific adapters according to Meyer and Kircher2010 

[60] were ligated to the end-repaired DNA in 25 pL reactions. The reaction was incubated for 15 min at 20°C and purified with PB 
buffer on QIAGEN MinElute columns, before eluted in 20 pL EB Buffer. The adaptor fill-in reaction was performed in a final volume 
of 30 pL and incubated for 20 min at 65°C followed by 20 min at 80°C to inactivate the Bst enzyme. To assess the amount of DNA 
libraries in each sample and therefore the optimal number of PCR cycle for library amplification, qPCR was performed using SYBR 
green MIX (Roche) according to manufacturer’s instructions and the same forward and reverse primers used for the following index 
PCR step. The DNA library (1 2 pi) was then amplified and indexed in a 50 pL PCR reaction, mixing with 25 pL 2X Kapa U+, 1 pL of each 
primer (1 0 pM, inPE forward primer + indexed reverse primer) and 1 1 pL H 2 0. Thermocycling conditions were 45 s at 98°C, followed 
by number of cycles (based on qPCR values) of 15 s at 98°C, 30 s at 65°C and 30 sat 72°C, and a final 1 min elongation step at 72°C. 
The amplified library was purified with PB buffer on QIAGEN MinElute columns, before being eluted in 50 pL EB. Negative library con- 
trols, constructed on EB, were included, as well as libraries constructed on the negative extractions controls. 

Sequencing of aDNA 

The purified DNA libraries were quantified on an Agilent Bioanalyzer 2100. The library pools were sequenced (1 00 bp, single read) on 
lllumina HiSeq 2500 system at the Danish National High-throughput DNA Sequencing Centre. Basecalling and sequence sorting by 
sample-specific indexes were performed by the Sequencing Centre using CASAVA v.1 .8.2. 

We sequenced the 44 complete ancient mitochondrial genomes to an average depth of 1 1.7x to 159. 7x (Table SI) which allowed us 
to apply strict quality filters for calling the consensus mtDNA sequences. After re-mapping the sequencing reads of an additional 
eight published samples [26] to the rCRS, we obtained an average mtDNA depth of 7.4x to 268x for these (Table SI). 

QUANTIFICATION AND STATISTICAL ANALYSIS 

Carbon dating 

To confirm archaeological ages of the samples, 10 skeletons from different sites were radiocarbon-dated at the Department of 
Physics & Astronomy, Aarhus University (Table SI). Mean calibrated ages (years BP) with 95.4% low and high probability intervals 
of n = 10 ancient samples were obtained by OxCal (Version 4.2; Oxford Radiocarbon Accelerator Unit). 
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Basic bioinformatics 

Adaptor sequences and stretches of Ns at both ends were trimmed from ail of the ancient DNA reads using AdapterRemoval 1.5.2 
[41], keeping only sequences with a minimum length of 30 bp. The trimmed sequences were mapped against the human mitochon- 
drial genome reference (rCRS) using BWA 0.6.2 aligner [42] with the seed disabled allowing higher sensitivity [61]. The aligned se- 
quences were filtered for mapping quality 30 and sorted using Picard (http://picard.sourceforge.net) and samtools [43]. Duplicate 
sequences at library level were removed by Picard MarkDuplicates (http://picard.sourceforge.net). Read depth was determined 
by a python script genobox_bam2avgdepth1.py available at Github. We also extracted sequencing reads mapping to rCRS from 
eight previously published ancient Armenian samples [26] and analyzed them together with our 44 new samples. The consensus 
mtDNA sequence for each sample was called using an in-house perl script that considers only bases with a base quality score of 
20 and sites with a sequencing depth of at least 5X. At each position a base was called only if it was observed in at least 70% of 
the reads covering that site [62]. 

Authentication of aDNA 

DNA damage parameters, namely the C->T transition rates that are typical for aDNA [63], were estimated using mapDamage2.0 [44] 
with default settings. The levels of human DNA contamination in the ancient samples were assessed with contamMix [45]. This soft- 
ware estimates for each ancient sample how well the mtDNA reads map to the consensus mtDNA sequence of the same sample 
compared to a collection of 31 1 worldwide mitochondrial genomes. The mtDNA consensus was called with the same parameters 
(and custom perl script) as described above. 

Haplogroups 

Mitochondrial haplogroups of the ancient individuals and the modern Armenian individuals were assigned using haplogrep [46]. The 
complete consensus mtDNA sequences of the ancient individuals were aligned with the modern Armenian data and comparative 
datasets using Geneious v.9.1 .3. Genetic diversity indexes, neutrality tests and F sr genetic distances were calculated with the Arle- 
quin software package [47]. The prcomp and CA (FactoMineR package) functions in the R statistical software were used to conduct 
MDS and correspondence analysis, respectively. We used the “DAPC” R package to conduct the discriminant analysis of principal 
components (DAPC) [64]. A phylogenetic network analysis of the ancient individuals and the modern Armenians was conducted with 
POPART (http://popart.otago.ac.nz) using the TCS algorithm. 

Phylogenetic tree construction 

The maximum-parsimony phylogenetic tree of ancient and modern Armenian complete mtDNA sequences was constructed using 
the mtPhyl v4.015 software (http://eltsov.org). Since this software uses earlier version of PhyloTree (Build 1 1) as its reference phy- 
logeny, the tree was modified manually to take into account the updated mtDNA phylogeny presented in PhyloTree Build 17 [48]. 
Numbers along links refer to substitutions scored relative to the rCRS [56]. Transversions are further specified; ins and del denote 
insertions and deletions of nucleotides, respectively; back mutations are underlined; symbol < denotes parallel mutation; heteroplas- 
mies are italicized and labeled using the IUPAC code; underlined positions with question marks correspond to back mutations at 
haplogroup-specific sites whose status is unknown due to unresolved bases in certain parts of the ancient DNA sequences. For phy- 
logeny reconstruction, the nucleotide position 1 651 9 as well as positions showing point indels and/or transversions located between 
nps 16180-16193, 303-315, 522-524, 573-576 were not used. Armenian samples are labeled as in Table S2. Established haplogroup 
labels are shown in black; blue are redefined and red are newly identified haplogroups in the present study. 

Effective population size 

To uncover the trajectory of the effective female population size (N e ) through time, we applied the Bayesian Skyline plot (BSP) method 
[65] implemented in BEAST v. 1 .8.2 [49]. We constructed the BSP using both modern and ancient mitochondrial genomes in a com- 
bined dataset, using the ages of the latter as calibration points in the tree inference. 

We partitioned the alignments into four datasets with a partition scheme adapted in accordance with a recently published molec- 
ular clock calibration for human mitochondrial DNA [66]: i) First and second nucleotides in codons of protein coding genes 
(PC1+PC2), ii) third nucleotides in codons of protein coding genes (PC3), iii) rRNAs+tRNAs, and iv) HVRI+HVRII. Indels were removed 
from the alignments to avoid potential biases due to possible misalignments and incorrect consensus calling around these regions. 
Partitioning was made by an in-house Python script, available through our Github account https://github.com/Granthiov/ 
My_Python_codes. Beauti [49] was used for generating the XML input files for BEAST v. 1 .8.2. When running BEAST, we used un- 
linked strict clock rates and substitution models for the four partitions but linked the generated trees. The following substitution 
models were used based on results of jModelTest v. 2. 1.10 [51] with the Bayesian Information Criterion: HVRI+HVRII - TrN+l+G; 
PC1+PC2 -TrN+l; PC3-TrN+G; rRNAs+tRNAs - FIKY+I. Every partition was assigned an independent mutation rate prior according 
to Rieux et al., 2014 [66]. We ran the MCMC chains for 10E8 states, sampling every 10E4 states, and designating the first 10E6 states 
as burn-in. The BEAST runs were performed using the CIPRES open-access server for phylogenetics studies [67]. We checked the 
output data for convergence to a stationary distribution and sufficient effective sample size estimates using Tracer v. 1.5 [50]. 
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Demographic model testing 

We simulated data and tested five possible demographic scenarios for their fit to our observed dataset using Approximate Bayesian 
Computation (ABC). Fastsimcoal2 [52] was used to create 1 0E6 simulations for each model scenario. The summary statistics of simu- 
lated and observed data were calculated with Arlsumstat [47]. We used seven different summary statistics for the ABC analyses: 
number of polymorphic sites (S), mean number of pairwise differences (tc) and Tajima’s D (D) calculated for the ancient individuals 
and modern Armenians separately, and the F S t genetic distance between the modern and ancient groups. We combined the three 
modern Armenian sub-populations into one and for all models, a generation time of 25 years and a mutation rate of 5.5x10 7 /site/ 
generation was used [66]. The ancient individuals were simulated with the time points corresponding to their calibrated 14 C-age 
or archaeological age estimates. 

The first four models assume a bottleneck c. 800 generations ago at the end of the last glacial maximum (LGM), in accordance with 
the general consensus and our results from the skyline analysis in BEAST. In Model 1 (Genetic Continuity ), this bottlenecked maternal 
population gives rise to the ancient South Caucasus population which is directly ancestral to the modern Armenian population of size 
N m . In Model 2 (Late Separation), the initial small group gives rise to two different populations representing ancestors to our sampled 
modern and ancient groups. This divergence occurs 300-799 generations ago, sometime after the LGM and before the sampling of 
our oldest individual. In Model 3 (Complete Separation), the initial maternal population splits into two groups right after the LGM form- 
ing the modern and ancient populations, respectively. Model 4 ( Separation with Admixture) is similar to Model 3, but with an admix- 
ture event between the two lineages 200 generations ago involving 1 % to 90% of the gene pool. The time for admixture event was 
chosen around the time of population density increase and formation of the Armenian language [21 , 68]. Model 5 (Admixture with 
Asian groups) starts with a bottleneck of a non-European maternal population around 2000 generations ago (50 kya) which eventually 
gives rise to a population that is ancestral to our modern and ancient sampled groups as well as an Asian lineage (ghost population). 
The demographic relationship of the ancient group and the modern Armenian group is here similar to Model 4, but we simulate an 
additional admixture event (1 % to 90%) 50 generations ago (around the time of Arab, Mongol and Turkic invasions) from the Asian 
ghost population into the modern Armenian maternal gene pool. 

For estimating the posterior densities of all parameter values, we used ABCtoolbox 2.0 (Beta) (https://bitbucket.org/phaentu/ 
abctoolbox-public/) [53]. The same software package was used to assess the fit between the simulated data from each of our 
five different models to the observed data. This was done by calculating marginal density and Tukey depth p values with 1000 re- 
tained simulations. The former estimates the fraction of the retained simulations whose marginal density is equal or smaller than 
that of the observed summary statistics, while the latter statistic indicates the fraction of simulations whose Tukey depth is lower 
or equal to the Tukey depth of the observed data. We used the cv4postpr function implemented in the ABC R package to cross-vali- 
date the ABC model selection with 100 pseudo-observed datasets. 

DATA AND SOFTWARE AVAILABILITY 

Raw fasta files of 52 ancient and 206 modern Armenian full mitochondrial genomes have been deposited in GenBank under acces- 
sion numbers MF362692-MF362949. 
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Figure SI. Correspondence analysis based on mtDNA extended haplogroup frequencies, 
Related to Figures 2 and 3 

Ken - Kenia[Sl]; Som - Somalia[S2]; Taj - Tajikistan[S3]; Kyr - Kyrgyzstan[S3]; Uzb - 
Uzbekistan[S3]; Kor - Korea[S4]; Han - ChinaSouth Han[S5]; It - Italy(Tuscany) [S6] ; Pol - 
Poland[S7]; Bui - Bulgaria[S8]; Jor - Jordan[S9]; Leb - Lebanon[S9]; Syr - Syria[S9]; Ady - 
Adyghe[S10]; Che - Chechnya[S10]; Kab - Kabardin[S10]; Ara - Ararat (Armenian: This study); 
Erz - Erzrum (Armenian: This study); Art - Artsakh (Armenian: This study); Ancient: This study. 
Haplogroup frequencies are provided in Table S4. 
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Figure S2. mtDNA network analysis and discriminant analysis of principal components, 
Related to Figures 2 and 3 

(A) TCS network analysis based on whole mtDNA sequences of the ancient individuals (red) and modem 
Armenians (green). The major mtDNA haplogroups are indicated. The three Armenian sub-populations are 
pooled into one group for this analysis. (B) Discriminant analysis of principal components (DAPC) plot 
based on 737 complete mitochondrial genome sequences. Colored dots represent individuals while the 
ellipses are based on inertia statistics (cellipse=0.5). The two axes represent 88.5% of total variation. The 
ancient group (black ellipse) shares much of its variation with the modem Armenian groups (red, light-blue 
and green ellipses) followed by the Georgian cluster (orange). 
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Table S3. Genetic diversity indexes, ABC model choice and model choice cross-validation, Related to 
Figures 2 and 4 

(A) Genetic diversity indexes in the ancient and 3 modem Armenian groups, n - number of individuals in 
each group; s - number of polymorphic sites; k - mean number of differences between all pairs of 
sequences; n - nucleotide diversity; * p<0.001 (B) ABC model choice cross-validation based on 100 samples 
from each model. The numbers show the frequency (%) of simulations of each model (mentioned in 
columns) that were chosen by abc while using pseudo-observed summary statistics from a randomly chosen 
simulation from the five models mentioned in the rows. (B) ABC Model choice results obtained by 
abctoolbox. 


Table SI, Table S2, Table S4, Table S5 and Data SI have been uploaded as 
separate excel files. 
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